57 research outputs found

Penalized Orthogonal-Components Regression for Large p Small n Data

Author: Lin Yanzhu
Zhang Dabao
Zhang Min
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 17/12/2008

We propose penalized orthogonal-components regression (POCRE) for large p small n data. Orthogonal components are sequentially constructed to maximize, upon standardization, their correlation with the response residuals. A new penalization framework, implemented via empirical Bayes thresholding, is presented to effectively identify the sparse predictors of each component. POCRE is computationally efficient owing to its sequential construction of leading sparse principal components. In addition, this construction offers other properties, such as grouping highly correlated predictors and allowing for collinear or nearly collinear predictors. With multivariate responses, POCRE can construct common components and thus build latent-variable models for large p small n data.

arXiv.org e-Print Archive

Crossref
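The sequential construction described in the POCRE abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: plain soft-thresholding is swapped in for the empirical Bayes thresholding the abstract names, and the function names, the relative threshold `lam`, and the deflation scheme are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(v, cutoff):
    """Shrink entries toward zero by `cutoff` (a stand-in for empirical Bayes thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - cutoff, 0.0)

def sparse_orthogonal_components(X, y, n_components=2, lam=0.5):
    """Hypothetical helper: build sparse, mutually orthogonal components one at a time.

    lam in [0, 1) is a relative threshold; loadings weaker than lam times the
    largest loading are set to zero.
    """
    Xk = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize predictors
    resid = y - y.mean()                        # centered response residuals
    weights, scores = [], []
    for _ in range(n_components):
        w = Xk.T @ resid                        # association of each predictor with residuals
        w = soft_threshold(w, lam * np.abs(w).max())
        if not np.any(w):
            break                               # nothing left to explain
        w /= np.linalg.norm(w)
        t = Xk @ w                              # scores of the new component
        # Deflate: remove the new component from the predictors and the residuals,
        # so that later components are orthogonal to this one.
        Xk = Xk - np.outer(t, t @ Xk) / (t @ t)
        resid = resid - t * (t @ resid) / (t @ t)
        weights.append(w)
        scores.append(t)
    return np.array(weights).T, np.array(scores).T
```

With `n` samples and `p` predictors, the function returns a `p`-by-`k` matrix of sparse loadings and an `n`-by-`k` matrix of component scores. Deflating the predictors after each step is what keeps the score vectors orthogonal to one another.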

Coefficients of Determination for Mixed-Effects Models

Author: Zhang Dabao
Publication date: 18/06/2021

The coefficient of determination is well defined for linear models, and its extension to mixed-effects models has long been sought. We revisit this extension to define measures of the proportion of variation explained by the whole model, by fixed effects only, and by random effects only. We propose to calculate unexplained variation conditional on individual random and/or fixed effects so as to retain the individual heterogeneity contributed by available predictors. While naturally defined for linear mixed models, these measures can also be defined for a generalized linear mixed model using a distance measured along its variance function, which accounts for its heteroscedasticity.

arXiv.org e-Print Archive
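For reference, the classical coefficient of determination that the abstract above takes as its starting point is R^2 = 1 - SS_res / SS_tot. The sketch below computes it for an ordinary linear model only. It does not implement the mixed-effects measures the paper proposes, and the helper names `fit_ols` and `r_squared` are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the fitted values."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimize ||A @ coef - y||^2
    return A @ coef

def r_squared(y, y_hat):
    """Classical R^2 = 1 - SS_res / SS_tot for an ordinary linear model."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)      # variation left unexplained by the fit
    ss_tot = np.sum((y - y.mean()) ** 2)   # total variation around the mean
    return 1.0 - ss_res / ss_tot
```

A fit that reproduces the response exactly gives a value of 1, and predicting the sample mean for every observation gives 0.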

Brain APOE expression quantitative trait loci-based association study identified one susceptibility locus for Alzheimer's disease by interacting with APOE epsilon 4

Author: Jiang Shan
Xu Dabao
Zhang Aiqian
Zhao Qingnan
Publication venue: Digital Commons@Becker
Publication date: 01/01/2018

Digital Commons@Becker

Case-control genome-wide association study of rheumatoid arthritis from Genetic Analysis Workshop 16 using penalized orthogonal-components regression-linear discriminant analysis

Author: Fleet James C
Lin Yanzhu
Pungpapong Vitara
Wang Libo
Zhang Dabao
Zhang Min
Publication venue: BioMed Central
Publication date: 01/01/2009

Currently, genome-wide association studies (GWAS) collect a massive number of single-nucleotide polymorphisms (SNPs; i.e., large p) for a relatively small number of individuals (i.e., small n), and associations between clinical phenotypes and genetic variation are tested one SNP at a time. Such univariate approaches ignore the linkage disequilibrium between SNPs in regions of low recombination, which lowers the reliability of candidate gene identification. Here we propose to improve the case-control GWAS approach by implementing linear discriminant analysis (LDA) through penalized orthogonal-components regression (POCRE), a newly developed variable selection method for large p small n data. The proposed POCRE-LDA method was applied to the Genetic Analysis Workshop 16 case-control data for rheumatoid arthritis (RA). In addition to the two regions on chromosomes 6 and 9 previously associated with RA by GWAS, we identified SNPs on chromosomes 10 and 18 as potential candidates for further investigation.

Springer - Publisher Connector

PubMed Central

Inferring Gene Regulatory Networks from a Population of Yeast Segregants

Author: Chen Chen
Hazbun Tony R
Zhang Dabao
Zhang Min
Publication venue: 'Purdue University (bepress)'
Publication date: 04/02/2019

Constructing gene regulatory networks is crucial to unraveling the genetic architecture of complex traits and to understanding the mechanisms of diseases. On the basis of gene expression and single nucleotide polymorphism data in the yeast Saccharomyces cerevisiae, we constructed gene regulatory networks using a two-stage penalized least squares method. A large system of structural equations was established at the first stage via optimal prediction of a set of surrogate variables, followed by consistent selection of regulatory effects at the second stage. Using this approach, we identified subnetworks that were enriched in gene ontology categories, revealing directional regulatory mechanisms controlling these biological pathways. Our mapping and analysis of expression-based quantitative trait loci uncovered a known alteration of gene expression within a biological pathway that results in regulatory effects on companion pathway genes in the phosphocholine network. In addition, we identified nodes in these gene ontology-enriched subnetworks that are coordinately controlled by transcription factors driven by trans-acting expression quantitative trait loci. Altogether, the integration of documented transcription factor regulatory associations with subnetworks defined by a system of structural equations using quantitative trait loci data is an effective means to delineate the transcriptional control of biological pathways.

Directory of Open Access Journals

Purdue E-Pubs

Genome-wide association analysis of GAW17 data using an empirical Bayes variable selection

Author: D Zhang
Dabao Zhang
IM Johnstone
LA Almasy
Libo Wang
Min Zhang
S Purcell
Vitara Pungpapong
Yanzhu Lin
Publication venue: BioMed Central
Publication date: 01/01/2011

Next-generation sequencing technologies enable us to explore rare functional variants. However, most current statistical techniques are underpowered to capture signals of rare variants in genome-wide association studies. We propose a supervised coalescing of single-nucleotide polymorphisms to obtain gene-based markers that can stably reveal possible genetic effects related to rare alleles. We use a newly developed empirical Bayes variable selection algorithm to identify associations between studied traits and genetic markers. Using our novel method, we analyzed the three continuous phenotypes in the GAW17 data set across 200 replicates, with intriguing results.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central